Unsupervised Learning of Depth and Ego-Motion from Video
We present an unsupervised learning framework for the task of monocular depth
and camera motion estimation from unstructured video sequences. We achieve this
by simultaneously training depth and camera pose estimation networks using the
task of view synthesis as the supervisory signal. The networks are thus coupled
via the view synthesis objective during training, but can be applied
independently at test time. Empirical evaluation on the KITTI dataset
demonstrates the effectiveness of our approach: 1) monocular depth performing
comparably with supervised methods that use either ground-truth pose or depth
for training, and 2) pose estimation performing favorably with established SLAM
systems under comparable input settings.
Comment: Accepted to CVPR 2017. Project webpage:
https://people.eecs.berkeley.edu/~tinghuiz/projects/SfMLearner
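The coupling described above works through differentiable image warping: the predicted depth and relative pose determine where each target pixel lands in a source view, so photometric error between the warped source view and the target frame can supervise both networks. A minimal single-pixel sketch of that reprojection (the function name and pixel-at-a-time treatment are illustrative; the actual training pipeline warps whole images with bilinear sampling):

```python
import numpy as np

def reproject(pt, depth, K, T):
    """Map a target-view pixel into a source view.

    pt    : homogeneous pixel (u, v, 1) in the target image
    depth : predicted depth at that pixel
    K     : 3x3 camera intrinsics
    T     : 4x4 relative pose, target frame -> source frame

    Implements p_s ~ K T D(p_t) K^{-1} p_t, the warping relation
    underlying the view-synthesis objective.
    """
    # Back-project the pixel to a 3D point in the target camera frame.
    cam_point = depth * (np.linalg.inv(K) @ pt)
    # Transform the point into the source camera frame.
    cam_point_src = T[:3, :3] @ cam_point + T[:3, 3]
    # Project into source-view pixel coordinates.
    proj = K @ cam_point_src
    return proj[:2] / proj[2]
```

With an identity relative pose the pixel maps back onto itself regardless of depth, which is a convenient sanity check for the geometry:

```python
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
reproject(np.array([100.0, 50.0, 1.0]), 2.0, K, np.eye(4))
# → array([100., 50.])
```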
Leveraging Vision Reconstruction Pipelines for Satellite Imagery
Reconstructing 3D geometry from satellite imagery is an important topic of
research. However, disparities exist between how this 3D reconstruction problem
is handled in the remote sensing context and how multi-view reconstruction
pipelines have been developed in the computer vision community. In this paper,
we explore whether state-of-the-art reconstruction pipelines from the vision
community can be applied to satellite imagery. Along the way, we address
several challenges in adapting vision-based structure from motion and
multi-view stereo methods. We show that vision pipelines can offer
competitive speed and accuracy in the satellite context.
Comment: Project Page: https://kai-46.github.io/VisSat
NYC3DCars: A Dataset of 3D Vehicles in Geographic Context
Geometry and geography can play an important role in recognition tasks in computer vision. To aid in studying connections between geometry and recognition, we introduce NYC3DCars, a rich dataset for vehicle detection in urban scenes built from Internet photos drawn from the wild, focused on densely trafficked areas of New York City. Our dataset is augmented with detailed geometric and geographic information, including full camera poses derived from structure from motion, 3D vehicle annotations, and geographic information from open resources, including road segmentations and directions of travel. NYC3DCars can be used to study new questions about using geometric information in detection tasks, and to explore applications of Internet photos in understanding cities. To demonstrate the utility of our data, we evaluate the use of the geographic information in our dataset to enhance a parts-based detection method, and suggest other avenues for future exploration.
- …